Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic

Authors

Somak Aditya

Yezhou Yang

Chitta Baral

Yiannis Aloimonos

Abstract

In this work, we explore a genre of puzzles (“image riddles”) that involves a set of images and a question. Answering these puzzles requires both visual detection capabilities (including object and activity recognition) and knowledge-based or commonsense reasoning. We compile a dataset of over 3k riddles, where each riddle consists of 4 images and a ground-truth answer. The annotations are validated using crowd-sourced evaluation. We also define an automatic evaluation metric to track future progress. Our task resembles commonly known IQ tasks, such as analogy solving and sequence filling, that are often used to test intelligence. We develop a probabilistic reasoning-based approach that uses probabilistic commonsense knowledge to answer these riddles with reasonable accuracy. We demonstrate the results of our approach using both automatic and human evaluations. Our approach achieves promising results on these riddles and provides a strong baseline for future attempts. We make the entire dataset and related materials publicly available to the community on the ImageRiddle website (http://bit.ly/22f9Ala).

Similar resources

Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering

Many vision and language tasks require commonsense reasoning beyond data-driven image and natural language processing. Here we adopt Visual Question Answering (VQA) as an example task, where a system is expected to answer a question in natural language about an image. Current state-of-the-art systems attempted to solve the task using deep neural architectures and achieved promising performance. ...

Explicit Reasoning over End-to-End Neural Architectures

Image Understanding using Vision and Reasoning through Scene Description Graph

Two of the fundamental tasks in image understanding using text are caption generation and visual question answering [1, 2]. This work presents an intermediate knowledge structure that can be used for both tasks to obtain increased interpretability. We call this knowledge structure Scene Description Graph (SDG), as it is a directed labeled graph, representing objects, actions, regions, as well a...

Load-Frequency Control: a GA based Bayesian Networks Multi-agent System

Bayesian Networks (BN) provide a robust probabilistic method of reasoning under uncertainty. They have been successfully applied in a variety of real-world tasks, but they have received little attention in the area of load-frequency control (LFC). In practice, LFC systems use proportional-integral controllers. However, since these controllers are designed using a linear model, the nonlinearities...

Graph Summarization in Annotated Data Using Probabilistic Soft Logic

Annotation graphs, made available through the Linked Data initiative and Semantic Web, have significant scientific value. However, their increasing complexity makes it difficult to fully exploit this value. Graph summaries, which group similar entities and relations for a more abstract view on the data, can help alleviate this problem, but new methods for graph summarization are needed that han...

Journal title:

CoRR

Volume abs/1611.05896

Publication date 2016
